AITopics | distribution family

Density estimation in high-dimensional settings is an important and challenging statistical problem.Traditional methods based on kernel smoothing are inefficient in high dimensions due to the difficulties in specifying appropriate location-adaptive kernels. In this work, we introduce pre-training, a key idea behind many cutting-edge AI technologies, to the context of non-parametric density estimation. By establishing a pre-trained neural network that can recommend an appropriate location-adaptive kernel for each sample point, efficient density estimation with adaptive kernels is achieved in high dimensions. A wide range of numerical experiments show that this strategy is highly effective for improving density-estimation accuracy, when the target distribution is close to the distribution family for pre-training. When the target distribution is substantially different from the pre-training distribution family, the benefit from the proposed pre-training strategy may be diluted, but can be reactivated by an additional fine-tuning procedure.

artificial intelligence, density estimation, machine learning, (17 more...)

arXiv.org Machine Learning

2605.13092

Country:

North America > United States (0.14)
Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Testing for Families of Distributions via the Fourier Transform

Neural Information Processing SystemsMar-16-2026, 23:31:11 GMT

We study the general problem of testing whether an unknown discrete distribution belongs to a specified family of distributions. More specifically, given a distribution family P and sample access to an unknown discrete distribution D, we want to distinguish (with high probability) between the case that D in P and the case that D is ε-far, in total variation distance, from every distribution in P . This is the prototypical hypothesis testing problem that has received significant attention in statistics and, more recently, in computer science. The main contribution of this work is a simple and general testing technique that is applicable to all distribution families whose Fourier spectrum satisfies a certain approximate sparsity property. We apply our Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables (SIIRVs), Poisson Multinomial Distributions (PMDs), and Discrete Log-Concave Distributions. For the first two, ours are the first non-trivial testers in the literature, vastly generalizing previous work on testing Poisson Binomial Distributions. For the third, our tester improves on prior work in both sample and time complexity.

artificial intelligence, machine learning, proceedings, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

6aaba9a124857622930ca4e50f5afed2-Paper.pdf

Neural Information Processing SystemsFeb-13-2026, 02:43:44 GMT

inference, optimization, variational distribution, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

MIST: Mutual Information Via Supervised Training

Gritsai, German, Richards, Megan, Méloux, Maxime, Cho, Kyunghyun, Peyrard, Maxime

arXiv.org Artificial IntelligenceNov-25-2025

We propose a fully data-driven approach to designing mutual information (MI) estimators. Since any MI estimator is a function of the observed sample from two random variables, we parameterize this function with a neural network (MIST) and train it end-to-end to predict MI values. Training is performed on a large meta-dataset of 625,000 synthetic joint distributions with known ground-truth MI. To handle variable sample sizes and dimensions, we employ a two-dimensional attention scheme ensuring permutation invariance across input samples. To quantify uncertainty, we optimize a quantile regression loss, enabling the estimator to approximate the sampling distribution of MI rather than return a single point estimate. This research program departs from prior work by taking a fully empirical route, trading universal theoretical guarantees for flexibility and efficiency. Empirically, the learned estimators largely outperform classical baselines across sample sizes and dimensions, including on joint distributions unseen during training. The resulting quantile-based intervals are well-calibrated and more reliable than bootstrap-based confidence intervals, while inference is orders of magnitude faster than existing neural baselines. Beyond immediate empirical gains, this framework yields trainable, fully differentiable estimators that can be embedded into larger learning pipelines. Moreover, exploiting MI's invariance to invertible transformations, meta-datasets can be adapted to arbitrary data modalities via normalizing flows, enabling flexible training for diverse target meta-distributions.

artificial intelligence, estimator, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2511.18945

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Testing for Families of Distributions via the Fourier Transform

Neural Information Processing SystemsNov-20-2025, 22:47:19 GMT

We study the general problem of testing whether an unknown discrete distribution belongs to a specified family of distributions. More specifically, given a distribution family P and sample access to an unknown discrete distribution D, we want to distinguish (with high probability) between the case that D in P and the case that D is ε-far, in total variation distance, from every distribution in P . This is the prototypical hypothesis testing problem that has received significant attention in statistics and, more recently, in computer science. The main contribution of this work is a simple and general testing technique that is applicable to all distribution families whose Fourier spectrum satisfies a certain approximate sparsity property. We apply our Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables (SIIRVs), Poisson Multinomial Distributions (PMDs), and Discrete Log-Concave Distributions. For the first two, ours are the first non-trivial testers in the literature, vastly generalizing previous work on testing Poisson Binomial Distributions. For the third, our tester improves on prior work in both sample and time complexity.

distribution, fourier transform, name change, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Data Science > Data Quality > Data Transformation (0.43)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Coupled Variational Bayes via Optimization Embedding

Neural Information Processing SystemsNov-20-2025, 17:11:56 GMT

Theoretically, we establish an interesting connection to gradient flow and demonstrate the extreme flexibility of this implicit distribution family in the limit sense.

artificial intelligence, machine learning, optimization, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Differentially Private Learning of Structured Discrete Distributions Ilias Diakonikolas University of Edinburgh Moritz Hardt Google Research Ludwig Schmidt MIT

Neural Information Processing SystemsOct-2-2025, 02:21:44 GMT

We investigate the problem of learning an unknown probability distribution over a discrete population from random samples. Our goal is to design efficient algorithms that simultaneously achieve low error in total variation norm while guaranteeing Differential Privacy to the individuals of the population. We describe a general approach that yields near sample-optimal and computationally efficient differentially private estimators for a wide range of well-studied and natural distribution families. Our theoretical results show that for a wide variety of structured distributions there exist private estimation algorithms that are nearly as efficient--both in terms of sample size and running time--as their non-private counterparts. We complement our theoretical guarantees with an experimental evaluation. Our experiments illustrate the speed and accuracy of our private estimators on both synthetic mixture models and a large public data set.

algorithm, approximation, privacy, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)

Add feedback

Statistical Test for Auto Feature Engineering by Selective Inference

Matsukawa, Tatsuya, Shiraishi, Tomohiro, Nishino, Shuichi, Katsuoka, Teruyuki, Takeuchi, Ichiro

arXiv.org Machine LearningOct-13-2024

Auto Feature Engineering (AFE) plays a crucial role in developing practical machine learning pipelines by automating the transformation of raw data into meaningful features that enhance model performance. By generating features in a data-driven manner, AFE enables the discovery of important features that may not be apparent through human experience or intuition. On the other hand, since AFE generates features based on data, there is a risk that these features may be overly adapted to the data, making it essential to assess their reliability appropriately. Unfortunately, because most AFE problems are formulated as combinatorial search problems and solved by heuristic algorithms, it has been challenging to theoretically quantify the reliability of generated features. To address this issue, we propose a new statistical test for generated features by AFE algorithms based on a framework called selective inference. As a proof of concept, we consider a simple class of tree search-based heuristic AFE algorithms, and consider the problem of testing the generated features when they are used in a linear model. The proposed test can quantify the statistical significance of the generated features in the form of $p$-values, enabling theoretically guaranteed control of the risk of false findings.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2410.19768

Country:

North America (0.14)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)

Genre:

Research Report > Experimental Study (0.51)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Testing for Families of Distributions via the Fourier Transform

Neural Information Processing SystemsOct-8-2024, 18:29:35 GMT

We study the general problem of testing whether an unknown discrete distribution belongs to a specified family of distributions. More specifically, given a distribution family P and sample access to an unknown discrete distribution D, we want to distinguish (with high probability) between the case that D in P and the case that D is ε-far, in total variation distance, from every distribution in P . This is the prototypical hypothesis testing problem that has received significant attention in statistics and, more recently, in computer science. The main contribution of this work is a simple and general testing technique that is applicable to all distribution families whose Fourier spectrum satisfies a certain approximate sparsity property. We apply our Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables (SIIRVs), Poisson Multinomial Distributions (PMDs), and Discrete Log-Concave Distributions.

distribution family, fourier transform, tester

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.43)
Information Technology > Data Science > Data Quality > Data Transformation (0.40)

Add feedback

Identifying Nonstationary Causal Structures with High-Order Markov Switching Models

Balsells-Rodas, Carles, Wang, Yixin, Mediano, Pedro A. M., Li, Yingzhen

arXiv.org Machine LearningJun-25-2024

Causal discovery in time series is a rapidly evolving field with a wide variety of applications in other areas such as climate science and neuroscience. Traditional approaches assume a stationary causal graph, which can be adapted to nonstationary time series with time-dependent effects or heterogeneous noise. In this work we address nonstationarity via regime-dependent causal structures. We first establish identifiability for high-order Markov Switching Models, which provide the foundations for identifiable regime-dependent causal discovery. Our empirical studies demonstrate the scalability of our proposed approach for high-order regime-dependent structure estimation, and we illustrate its applicability on brain activity data.

assumption, identifiability, linear independence, (15 more...)

arXiv.org Machine Learning

2406.17698

Country: